One-shot Neural Backdoor Erasing via Adversarial Weight Masking
Recent studies show that despite achieving high accuracy on a number of
real-world applications, deep neural networks (DNNs) can be backdoored: by
injecting triggered data samples into the training dataset, the adversary can
mislead the trained model into classifying any test data to the target class as
long as the trigger pattern is present. To nullify such backdoor threats,
various methods have been proposed. Particularly, a line of research aims to
purify the potentially compromised model. However, one major limitation of this
line of work is the requirement to access sufficient original training data:
the purifying performance is much worse when the available training data is
limited. In this work, we propose Adversarial Weight Masking (AWM), a novel
method capable of erasing the neural backdoors even in the one-shot setting.
The key idea behind our method is to formulate this into a min-max optimization
problem: first, adversarially recover the trigger patterns and then (soft) mask
the network weights that are sensitive to the recovered patterns. Comprehensive
evaluations of several benchmark datasets suggest that AWM can largely improve
the purifying effects over other state-of-the-art methods on various available
training dataset sizes.
Comment: Accepted by NeurIPS 2022 (19 pages, 6 figures, 10 tables).
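The min-max loop described above can be sketched on a toy model. The following is a minimal illustration under assumed details (a one-layer linear model, hand-picked step sizes, an L1 term on the removed mask mass), not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=4)            # frozen weights of the (possibly backdoored) model
x_clean = rng.normal(size=4)      # the single clean sample available ("one-shot")
m = np.ones_like(W)               # soft weight mask in [0, 1]
delta = np.zeros_like(W)          # adversarially recovered trigger pattern

def masked_logit(mask, x):
    return float((mask * W) @ x)  # output of the masked model

clean_ref = masked_logit(np.ones_like(W), x_clean)  # clean behaviour to preserve
lr_d, lr_m, lam, beta = 0.1, 0.05, 0.01, 1.0        # assumed hyperparameters

for _ in range(100):
    # Inner max: gradient-ascend the trigger so it maximizes the masked model's
    # logit (a stand-in for "push inputs toward the attacker's target class").
    delta = np.clip(delta + lr_d * (m * W), -0.5, 0.5)

    # Outer min: push down the triggered logit, anchor the clean logit near its
    # original value, and keep the removed mask mass sparse (L1 on 1 - m).
    g = (W * (x_clean + delta)
         + beta * 2.0 * (masked_logit(m, x_clean) - clean_ref) * (W * x_clean)
         - lam)
    m = np.clip(m - lr_m * g, 0.0, 1.0)
```

The inner step plays the adversary recovering a trigger; the outer step softly masks the weights that trigger exploits while anchoring behaviour on the clean sample.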
Testing for a Common Volatility Process and Information Spillovers in Bivariate Financial Time Series Models
The paper considers the problem of whether financial returns have a common volatility process, in the framework of the stochastic volatility models suggested by Harvey et al. (1994). We propose a stochastic volatility version of the ARCH test proposed by Engle and Susmel (1993), who investigated whether international equity markets have a common volatility process. The paper also checks the hypothesis of frictionless cross-market hedging, which implies perfectly correlated volatility changes, as suggested by Fleming et al. (1998). In deriving the Lagrange Multiplier test statistic, the paper uses the technique of Chesher (1984) for differentiating an integral that contains a degenerate density function.
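The stochastic volatility framework referenced above can be sketched as follows; the notation here is simplified and ours, not necessarily the paper's:

```latex
% Bivariate stochastic volatility model (Harvey et al., 1994), sketched.
% Returns y_{it}, latent log-volatilities h_{it}:
y_{it} = \epsilon_{it}\,\exp(h_{it}/2), \qquad
h_{i,t+1} = \gamma_i + \phi_i h_{it} + \eta_{it}, \qquad i = 1, 2,
% with \epsilon_{it} and \eta_{it} mean-zero disturbances. The null of a
% common volatility process restricts the latent processes to coincide:
H_0\colon\ h_{1t} = h_{2t} \quad \text{for all } t.
```

Under this null, volatility changes in the two series are perfectly correlated, which is the restriction the proposed test targets.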
Testing for volatility co-movement in bivariate stochastic volatility models
The paper considers the problem of volatility co-movement, namely whether two financial returns have a perfectly correlated common volatility process, in the framework of multivariate stochastic volatility models, and proposes a test for volatility co-movement. The proposed test is a stochastic volatility version of the co-movement test of Engle and Susmel (1993), who investigated whether international equity markets have volatility co-movement in the framework of the ARCH model.
In the empirical analysis we find that volatility co-movement exists among closely linked stock markets, and that volatility co-movement in exchange rate markets tends to appear when the overall volatility level is low. This contrasts with the often-cited finding in the financial contagion literature that financial returns co-move in levels during financial crises.
Do Language Models Plagiarize?
Past literature has illustrated that language models (LMs) often memorize
parts of training instances and reproduce them in natural language generation
(NLG) processes. However, it is unclear to what extent LMs "reuse" a training
corpus. For instance, models can generate paraphrased sentences that are
contextually similar to training samples. In this work, therefore, we study
three types of plagiarism (i.e., verbatim, paraphrase, and idea) among GPT-2
generated texts, in comparison to its training data, and further analyze the
plagiarism patterns of fine-tuned LMs with domain-specific corpora which are
extensively used in practice. Our results suggest that (1) three types of
plagiarism widely exist in LMs beyond memorization, (2) both size and decoding
methods of LMs are strongly associated with the degrees of plagiarism they
exhibit, and (3) fine-tuned LMs' plagiarism patterns vary based on their corpus
similarity and homogeneity. Given that a majority of LMs' training data is
scraped from the Web without informing content owners, their reiteration of
words, phrases, and even core ideas from training sets into generated texts has
ethical implications. These patterns are likely to worsen as both the size of LMs and their training data increase, raising concerns about indiscriminately pursuing larger models with larger training corpora.
Plagiarized content can also contain individuals' personal and sensitive
information. These findings overall cast doubt on the practicality of current
LMs in mission-critical writing tasks and urge more discussions around the
observed phenomena. Data and source code are available at
https://github.com/Brit7777/LM-plagiarism.
Comment: Accepted to WWW'2
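As an illustration of the "verbatim" category alone (not the paper's actual pipeline, which also covers paraphrase and idea plagiarism), the longest word sequence shared verbatim between a generated text and a training document can be found with standard dynamic programming:

```python
def longest_shared_ngram(generated: str, training_doc: str) -> int:
    """Length (in words) of the longest word sequence shared verbatim
    between two texts -- a crude proxy for verbatim reuse."""
    a, b = generated.split(), training_doc.split()
    # dp[i][j] = length of the common run ending at a[i-1] and b[j-1]
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
                best = max(best, dp[i][j])
    return best
```

For example, `longest_shared_ngram("the cat sat on the mat", "a cat sat on a mat")` returns 3, for the shared run "cat sat on".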
Unbiased Math Word Problems Benchmark for Mitigating Solving Bias
In this paper, we revisit solving bias when evaluating models on current Math Word Problem (MWP) benchmarks. Current solvers suffer from solving bias, which consists of data bias and learning bias caused by biased datasets and improper training strategies. Our experiments verify that MWP solvers are easily biased by training datasets that do not cover diverse questions for each problem narrative, so a solver learns only shallow heuristics rather than the deep semantics needed to understand problems. Besides, an
MWP can be naturally solved by multiple equivalent equations while current
datasets take only one of the equivalent equations as ground truth, forcing the
model to match the labeled ground truth and ignoring other equivalent
equations. Here, we first introduce a novel MWP dataset named UnbiasedMWP, which is constructed by varying the grounded expressions in our collected data and manually annotating them with multiple corresponding new questions. Then, to
further mitigate learning bias, we propose a Dynamic Target Selection (DTS)
Strategy to dynamically select more suitable target expressions according to
the longest prefix match between the current model output and candidate
equivalent equations, which are obtained by applying the commutative law during training. The results show that our UnbiasedMWP has significantly fewer biases than its original data and other datasets, making it a promising benchmark for fairly evaluating solvers' reasoning skills rather than their ability to match nearest neighbors. Solvers trained with our DTS achieve higher accuracy on multiple MWP benchmarks. The source code is available at
https://github.com/yangzhch6/UnbiasedMWP
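The target-selection step can be sketched as follows. The function name and token representation are illustrative assumptions; the equivalent candidates stand in for variants obtained by applying the commutative law to the labeled equation:

```python
def select_target(partial_output, candidates):
    """Dynamic Target Selection sketch: among equivalent candidate equations,
    pick the one sharing the longest token prefix with the model's current
    output, so training does not penalize an equally valid ordering."""
    def prefix_len(cand):
        n = 0
        for got, want in zip(partial_output, cand):
            if got != want:
                break
            n += 1
        return n
    return max(candidates, key=prefix_len)

# Commutative variants of "x = a + b", as token lists:
candidates = [["x", "=", "a", "+", "b"], ["x", "=", "b", "+", "a"]]
```

If the model has emitted `["x", "=", "b"]` so far, the second candidate shares a longer prefix and becomes the training target, rather than forcing a match to the single labeled ordering.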